Detecting influential observations in principal components and common principal components

Authors

  • Graciela Boente
  • Ana M. Pires
  • Isabel M. Rodrigues
Abstract

Detecting outlying observations is an important step in any analysis, even when robust estimates are used. In particular, the robustified Mahalanobis distance is a natural measure of outlyingness if one focuses on ellipsoidal distributions. However, it is well known that the asymptotic chi-square approximation for the cutoff value of the Mahalanobis distance based on several robust estimates (like the minimum volume ellipsoid, the minimum covariance determinant and the S-estimators) is not adequate for detecting atypical observations in small samples from the normal distribution. In the multi-population setting and under a common principal components model, aggregated measures based on standardized empirical influence functions are used to detect observations with a significant impact on the estimators. As in the one-population setting, the cutoff values obtained from the asymptotic distribution of those aggregated measures are not adequate for small samples. More appropriate cutoff values, adapted to the sample sizes, can be computed by using a cross-validation approach. Cutoff values obtained from a Monte Carlo study using S-estimators are provided for illustration. A real data set is also analyzed. © 2010 Elsevier B.V. All rights reserved.
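The idea of replacing the asymptotic chi-square cutoff with a simulated, sample-size-specific one can be illustrated with a minimal sketch. It covers only the one-population robust Mahalanobis distance, not the common principal components measures. It also does not use the S-estimators of the paper: `trimmed_location_scatter` is a hypothetical, crude concentration-step estimator swapped in so that the example needs only NumPy, and all function names and defaults are assumptions made for illustration.

```python
import numpy as np


def squared_distances(x, mu, sigma):
    """Squared Mahalanobis distances of the rows of x from (mu, sigma)."""
    centered = x - mu
    solved = np.linalg.solve(sigma, centered.T).T
    return np.einsum("ij,ij->i", centered, solved)


def trimmed_location_scatter(x, h_frac=0.75, n_steps=10):
    """Crude trimmed estimator (a stand-in, NOT an S-estimator).

    Starts from the classical mean and covariance, then repeatedly
    re-estimates both from the h observations with the smallest distances.
    No consistency factor is applied, so the scatter is biased downwards.
    """
    n = x.shape[0]
    h = int(np.ceil(h_frac * n))
    mu = x.mean(axis=0)
    sigma = np.cov(x, rowvar=False)
    for _ in range(n_steps):
        keep = np.argsort(squared_distances(x, mu, sigma))[:h]
        mu = x[keep].mean(axis=0)
        sigma = np.cov(x[keep], rowvar=False)
    return mu, sigma


def monte_carlo_cutoff(n, p, level=0.975, n_rep=500, seed=0):
    """Empirical quantile of the squared robust distances under normality."""
    rng = np.random.default_rng(seed)
    pooled = []
    for _ in range(n_rep):
        x = rng.standard_normal((n, p))
        mu, sigma = trimmed_location_scatter(x)
        pooled.append(squared_distances(x, mu, sigma))
    return float(np.quantile(np.concatenate(pooled), level))


# Compare with the asymptotic cutoff for p = 2, where the chi-square
# quantile has the closed form -2 * log(1 - level).
level = 0.975
simulated = monte_carlo_cutoff(n=20, p=2, level=level)
asymptotic = -2.0 * np.log(1.0 - level)
print(f"simulated cutoff: {simulated:.2f}, chi-square cutoff: {asymptotic:.2f}")
```

Because the stand-in estimator carries no consistency correction, the simulated cutoff absorbs that scaling as well as the small-sample effect. A properly calibrated robust estimator would isolate the latter.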

Related articles

A new weighting approach to non-parametric composite indices compared with principal components analysis

The introduction of the Human Development Index (HDI) by UNDP in early 1990 was followed by a surge in the use of non-parametric and parametric indices for measuring and comparing countries' performance in development, globalization, competition, well-being, etc. The HDI is a composite index of three indicators. Its components reflect three major dimensions of human development: longevity, knowledg...

On convergence of sample and population Hilbertian functional principal components

In this article we consider the sequences of sample and population covariance operators for a sequence of arrays of Hilbertian random elements. Then, under the assumptions that the norms of the covariance operators are uniformly bounded and the sequences of the principal component scores are uniformly summable, we prove that the convergence of the sequences of covariance operators would impl...

Persian Handwriting Analysis Using Functional Principal Components

Principal components analysis is a well-known statistical method for dealing with large dependent data sets. It is also used with functional data, both for data reduction and for representing variation. On the other hand, handwriting is one of the objects studied in various statistical fields such as pattern recognition and shape analysis. Considering time as the argument,...

Morphological Comparison of two populations of lake goby Rhinogobius similis Gill, 1859 from Hariroud basin

Knowledge of fish species is important in habitat protection management. This study was conducted to compare the morphological characteristics of two populations of Rhinogobius similis from Hariroud basin based on the landmark morphometric truss network system. A total of 60 individuals from Polkhatoun (30 specimens) and Tafrihgah dam (30 specimens) stations were caught by electrofishing 220 vo...

Evaluation and Geographical analysis of the principal components affecting urban economic sustainability, Case study: Cities of Chaharmahal and Bakhtiari Province

Abstract Aims & Backgrounds: Today, economic challenges are one of the most important obstacles to achieving sustainability in the cities of developing countries. Therefore, recognition and geographical analysis of the factors affecting the economic sustainability of cities are among the important goals and priorities of urban and regional planning. Methodology: This research has been done by q...

Journal title:
  • Computational Statistics & Data Analysis

Volume 54  Issue 

Pages  -

Publication date 2010